FAQ
In your search for information about Istio and service mesh technology, we hope this FAQ helps!
General
Istio is an open platform-independent service mesh that provides traffic management, policy enforcement, and telemetry collection.
Open: Istio is being developed and maintained as open-source software. We encourage contributions and feedback from the community at large.
Platform-independent: Istio is not targeted at any specific deployment environment. During the initial stages of development, Istio will support Kubernetes-based deployments. However, Istio is being built to enable rapid and easy adaptation to other environments.
Service mesh: Istio is designed to manage communications between microservices and applications. Without requiring changes to the underlying services, Istio provides automated baseline traffic resilience, service metrics collection, distributed tracing, traffic encryption, protocol upgrades, and advanced routing functionality for all service-to-service communication.
For more detail, please see The Istio service mesh.
Traditionally, much of the logic handled by Istio has been built directly into applications. Across a fleet of services, managing updates to this communications logic can be a large burden. Istio provides an infrastructure-level solution to managing service communications.
Application developers: With Istio managing how traffic flows across their services, developers can focus exclusively on business logic and iterate quickly on new features.
Service operators: Istio enables policy enforcement and mesh monitoring from a single centralized control point, independent of application evolution. As a result, operators can ensure continuous policy compliance through a simplified management plane.
We recommend following the instructions on the getting started page, which installs a demonstration configuration along with Istio’s premier sample application, Bookinfo. You can then use this setup to walk through various Istio guides that showcase intelligent routing, policy enforcement, security, telemetry, etc., in a tutorial style.
To start using Istio with production Kubernetes deployments, please refer to our deployment models documentation and the "Which Istio installation method should I use?" FAQ entry.
Istio uses the Apache License 2.0.
The Istio project was started by teams from Google and IBM in partnership with the Envoy team from Lyft. It’s been developed fully in the open on GitHub.
Istio is designed to be platform-independent, initially focused on Kubernetes. For our 1.23 release, Istio supports environments running Kubernetes (1.27, 1.28, 1.29, 1.30).
Contributions are highly welcome. We look forward to community feedback, additions, and bug reports.
The code repositories are hosted on GitHub. Please see our Contribution Guidelines to learn how to contribute.
In addition to the code, there are other ways to contribute to the Istio community, including on our discussion forum, Slack, and Stack Overflow.
Check out the documentation right here on istio.io. The docs include concept overviews, task guides, examples, and the complete reference documentation.
Detailed developer-level documentation is maintained on our Wiki.
Check out the operations guide for finding solutions and our bug reporting page for filing bugs.
See our feature stages page and news for the latest happenings.
It’s the Greek word for ‘sail’.
If you’d like to have live interactions with members of our community, you can join us on Istio’s Slack workspace.
Setup
In addition to the simple getting started evaluation install, there are several different methods you can use to install Istio. Which one you should use depends on your production requirements. The following lists some of the pros and cons of each of the available methods:
The simplest and most qualified installation and management path with high security. This is the community recommended method for most use cases.
Pros:
- Thorough configuration validation and health verification.
- Uses the IstioOperator API which provides extensive configuration/customization options.
- No in-cluster privileged pods needed. Changes are actuated by running the istioctl command.
Cons:
- Multiple binaries must be managed, one per Istio minor version.
- The istioctl command can set values like JWT_POLICY based on your running environment, thereby producing varying installations in different Kubernetes environments.
Generate the Kubernetes manifest and then apply with kubectl apply --prune. This method is suitable where strict auditing or augmentation of output manifests is needed.
Pros:
- Resources are generated from the same IstioOperator API as used in istioctl install and Operator.
- Uses the IstioOperator API which provides extensive configuration/customization options.
Cons:
- Some checks performed in istioctl install and Operator are not done.
- UX is less streamlined compared to istioctl install.
- Error reporting is not as robust as istioctl install for the apply step.
Using Helm charts allows easy integration with Helm based workflows and automated resource pruning during upgrades.
Pros:
- Familiar approach using industry standard tooling.
- Helm native release and upgrade management.
Cons:
- Fewer checks and validations compared to istioctl install and Operator.
- Some administrative tasks require more steps and have higher complexity.
The Istio operator provides an installation path without needing the istioctl binary. This can be used for simplified upgrade workflows where running an in-cluster privileged controller is not a concern. This method is suitable where strict auditing or augmentation of output manifests is not needed.
Pros:
- Same API as istioctl install but actuation is through a controller pod in the cluster with a fully declarative operation.
- Uses the IstioOperator API which provides extensive configuration/customization options.
- No need to manage multiple istioctl binaries.
Cons:
- High privilege controller running in the cluster poses security risks.
Installation instructions for all of these methods are available on the Istio install page.
Ensure that your cluster has met the prerequisites for automatic sidecar injection. If your microservice is deployed in the kube-system, kube-public, or istio-system namespace, it is exempted from automatic sidecar injection. Please use a different namespace instead.
Ambient Mode
Istio’s ztunnel does not introduce a single point of failure (SPOF) into a Kubernetes cluster. Failures of ztunnel are confined to a single node, which is considered a fallible component in a cluster. It behaves the same as other node-critical infrastructure running on every cluster such as the Linux kernel, container runtime, etc. In a properly designed system, node outages do not lead to cluster outages. Learn more.
Security
You can change mutual TLS settings for your services at any time using authentication policy and destination rule. See task for more details.
Authentication policy can be mesh-wide (which affects all services in the mesh), namespace-wide (all services in the same namespace), or service-specific. You can combine one or more policies to set up mutual TLS for services in a cluster in any way you want.
If you installed Istio with values.global.proxy.privileged=true, you can use tcpdump to determine encryption status. Also, in Kubernetes 1.23 and later, as an alternative to installing Istio as privileged, you can use kubectl debug to run tcpdump in an ephemeral container. See Istio mutual TLS migration for instructions.
When STRICT mutual TLS is enabled, non-Istio workloads cannot communicate with Istio services, as they will not have a valid Istio client certificate.
If you need to allow these clients, the mutual TLS mode can be set to PERMISSIVE, which allows both plaintext and mutual TLS. This can be done for individual workloads or the entire mesh.
See Authentication Policy for more details.
If mutual TLS is enabled, HTTP and TCP health checks from the kubelet will not work without modification, since the kubelet does not have Istio-issued certificates.
There are several options:
- Using probe rewrite to redirect liveness and readiness requests to the workload directly. Please refer to Probe Rewrite for more information. This is enabled by default and recommended.
- Using a separate port for health checks and enabling mutual TLS only on the regular service port. Please refer to Health Checking of Istio Services for more information.
- Using the PERMISSIVE mode for the workload, so it can accept both plaintext and mutual TLS traffic. Please keep in mind that mutual TLS is not enforced with this option.
For the workloads running in Kubernetes, the lifetime of their Istio certificates is by default 24 hours.
This configuration may be overridden by customizing the proxyMetadata field of the proxy configuration. For example:
proxyMetadata:
  SECRET_TTL: 48h
No. When traffic.sidecar.istio.io/excludeInboundPorts is used on server workloads, Istio still configures the client Envoy to send mutual TLS by default. To change that, you need to configure a Destination Rule with mutual TLS mode set to DISABLE to have clients send plain text to those ports.
You may find that MySQL can't connect after installing Istio. This is because MySQL is a server-first protocol, which can interfere with Istio's protocol detection. In particular, using PERMISSIVE mTLS mode may cause issues.
You may see error messages such as ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0.
This can be fixed by ensuring STRICT or DISABLE mode is used, or that all clients are configured to send mTLS. See server first protocols for more information.
Yes. Istio provides authorization features for both HTTP and plain TCP services in the mesh. Learn more.
By following the instructions in the Secure Ingress Traffic task, Istio Ingress can be secured to only accept TLS traffic.
Yes, you can. It works both with mutual TLS enabled and disabled.
Metrics and Logs
You can collect telemetry data about Istio using Prometheus, and then use Prometheus's HTTP API to query that data.
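As a minimal sketch of querying that data, the snippet below builds the URL for an instant query against Prometheus's /api/v1/query endpoint. The base address (Prometheus port-forwarded to localhost:9090) and the chosen PromQL expression are assumptions for illustration; istio_requests_total is one of Istio's standard metrics.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL of an instant query against Prometheus's HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumed address: Prometheus reachable on localhost:9090 (e.g. via port-forward).
url = build_instant_query_url(
    "http://localhost:9090", "sum(rate(istio_requests_total[5m]))"
)
print(url)
```

Fetching this URL (for example with urllib.request.urlopen) should return a JSON document whose data field carries the query result, provided Prometheus is reachable at the assumed address.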
In-proxy telemetry (aka v2) reduces resource cost and improves proxy performance as compared to the Mixer-based telemetry (aka v1) approach, and is the preferred mechanism for surfacing telemetry in Istio. However, there are a few differences in reported telemetry between v1 and v2, which are listed below:
- Missing labels for out-of-mesh traffic: In-proxy telemetry relies on metadata exchange between Envoy proxies to gather information like peer workload name, namespace and labels. In Mixer-based telemetry this functionality was performed by Mixer as part of combining request attributes with the platform data. This metadata exchange is performed by the Envoy proxies by adding a specific HTTP header for HTTP protocol or augmenting ALPN protocol for TCP protocol as described here. This requires Envoy proxies to be injected at both the client and server workloads, implying that the telemetry reported when one peer is not in the mesh will be missing peer attributes like workload name, namespace and labels. However, if both peers have proxies injected, all the labels mentioned here are available in the generated metrics. When the server workload is out of the mesh, server workload metadata is still distributed to the client sidecar, causing client-side metrics to have server workload metadata labels filled.
- TCP metadata exchange requires mTLS: TCP metadata exchange relies on the Istio ALPN protocol, which requires mutual TLS (mTLS) to be enabled for the Envoy proxies to exchange metadata successfully. This implies that if mTLS is not enabled in your cluster, telemetry for TCP protocol will not include peer information like workload name, namespace and labels.
- No mechanism for configuring custom buckets for histogram metrics: Mixer-based telemetry supported customizing buckets for histogram type metrics like request duration and TCP byte sizes. In-proxy telemetry has no such available mechanism. Additionally, the buckets available for latency metrics in in-proxy telemetry are in milliseconds as compared to seconds in Mixer-based telemetry. However, more buckets are available by default in in-proxy telemetry for latency metrics at the lower latency levels.
- No metric expiration for short-lived metrics: Mixer-based telemetry supported metric expiration whereby metrics which were not generated for a configurable amount of time were de-registered for collection by Prometheus. This is useful in scenarios, such as one-off jobs, that generate short-lived metrics. De-registering the metrics prevents reporting of metrics which would no longer change in the future, thereby reducing network traffic and storage in Prometheus. This expiration mechanism is not available in in-proxy telemetry. The workaround for this can be found here.
Short-lived metrics can hamper the performance of Prometheus, as they often are a large source of label cardinality. Cardinality is a measure of the number of unique values for a label. To manage the impact of your short-lived metrics on Prometheus, you must first identify the high cardinality metrics and labels. Prometheus provides cardinality information at its /status page. Additional information can be retrieved via PromQL.
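To make the notion of cardinality concrete, here is a small sketch that counts the distinct values per label across a set of time series. The function name and the sample label sets are hypothetical; in practice the label sets would come from your own Prometheus data.

```python
from collections import defaultdict


def label_cardinality(series):
    """Map each label name to the number of distinct values seen across the series."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


# Hypothetical label sets for four time series of a single metric.
series = [
    {"destination_service": "reviews.default.svc.cluster.local", "response_code": "200"},
    {"destination_service": "reviews.default.svc.cluster.local", "response_code": "503"},
    {"destination_service": "10.0.0.7:9080", "response_code": "200"},
    {"destination_service": "example.com", "response_code": "200"},
]
print(label_cardinality(series))
```

In this sample, destination_service takes more distinct values than response_code, which is the kind of label worth inspecting first.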
There are several ways to reduce the cardinality of Istio metrics:
- Disable host header fallback. The destination_service label is one potential source of high cardinality. The values for destination_service default to the host header if the Istio proxy is not able to determine the destination service from other request metadata. If clients are using a variety of host headers, this could result in a large number of values for destination_service. In this case, follow the metric customization guide to disable host header fallback mesh wide. To disable host header fallback for a particular workload or namespace, you need to copy the stats EnvoyFilter configuration, update it to have host header fallback disabled, and apply it with a more specific selector. This issue has more detail on how to achieve this.
- Drop unnecessary labels from collection. If the label with high cardinality is not needed, you can drop it from metric collection via metric customization using tags_to_remove.
- Normalize label values, either through federation or classification. If the information provided by the label is desired, you can use Prometheus federation or request classification to normalize the label.
Mixer was removed in the 1.8 Istio release. Migration is needed if you still rely on Mixer’s built-in adapters or any out-of-process adapters for mesh extension.
For built-in adapters, several alternatives are provided:
- Prometheus and Stackdriver integrations are implemented as proxy extensions. Customization of telemetry generated by these two extensions can be achieved via request classification and Prometheus metrics customization.
- Global and Local Rate-Limiting (memquota and redisquota adapters) functionality is provided through the Envoy-based rate-limiting solution.
- OPA adapter is replaced by the Envoy ext-authz based solution, which supports integration with OPA policy agent.
For custom out-of-process adapters, migration to Wasm-based extensions is strongly encouraged. Please refer to the guides on Wasm module development and extension distribution. As a temporary solution, you can enable Envoy ext-authz and gRPC access log API support in Mixer, which allows you to upgrade Istio to post-1.7 versions while still using 1.7 Mixer with out-of-process adapters. This will give you more time to migrate to Wasm-based extensions. Note that this temporary solution is not battle-tested and is unlikely to get patch fixes, since it is only available on the Istio 1.7 branch, which is out of its support window after Feb 2021.
You can use docker-compose to install Prometheus.
You can enable tracing to determine the flow of a request in Istio.
Additionally, you can use the following commands to know more about the state of the mesh:
- istioctl proxy-config: Retrieve information about proxy configuration when running in Kubernetes:
  # Retrieve information about bootstrap configuration for the Envoy instance in the specified pod.
  $ istioctl proxy-config bootstrap productpage-v1-bb8d5cbc7-k7qbm
  # Retrieve information about cluster configuration for the Envoy instance in the specified pod.
  $ istioctl proxy-config cluster productpage-v1-bb8d5cbc7-k7qbm
  # Retrieve information about listener configuration for the Envoy instance in the specified pod.
  $ istioctl proxy-config listener productpage-v1-bb8d5cbc7-k7qbm
  # Retrieve information about route configuration for the Envoy instance in the specified pod.
  $ istioctl proxy-config route productpage-v1-bb8d5cbc7-k7qbm
  # Retrieve information about endpoint configuration for the Envoy instance in the specified pod.
  $ istioctl proxy-config endpoints productpage-v1-bb8d5cbc7-k7qbm
  # Try the following to discover more proxy-config commands.
  $ istioctl proxy-config --help
- kubectl get: Gets information about different resources in the mesh along with routing configuration:
  # List all virtual services
  $ kubectl get virtualservices
Yes. Prometheus is an open source monitoring system and time series database. You can use Prometheus with Istio to record metrics that track the health of Istio and of applications within the service mesh. You can visualize metrics using tools like Grafana and Kiali. See Configuration for Prometheus to understand how to enable collection of metrics.
A few notes:
- If the Prometheus pod started before the istiod pod could generate the required certificates and distribute them to Prometheus, the Prometheus pod will need to be restarted in order to collect from mutual TLS-protected targets.
- If your application exposes Prometheus metrics on a dedicated port, that port should be added to the service and deployment specifications.
Distributed Tracing
Istio integrates with distributed tracing systems using Envoy-based tracing. With Envoy-based tracing integration, applications are responsible for forwarding tracing headers for subsequent outgoing requests.
You can find additional information in the Istio Distributed Tracing (Jaeger, Lightstep, Zipkin) Tasks and in the Envoy tracing docs.
Istio enables reporting of trace spans for workload-to-workload communications within a mesh. However, in order for various trace spans to be stitched together for a complete view of the traffic flow, applications must propagate the trace context between incoming and outgoing requests.
In particular, Istio relies on applications to propagate the B3 trace headers, as well as the Envoy-generated request ID. These headers include:
x-request-id
x-b3-traceid
x-b3-spanid
x-b3-parentspanid
x-b3-sampled
x-b3-flags
b3
If you are using Lightstep, you will also need to forward the following headers:
x-ot-span-context
If you are using OpenTelemetry or Stackdriver, you will also need to forward the following headers:
traceparent
tracestate
Header propagation may be accomplished through client libraries, such as Zipkin or Jaeger. It may also be accomplished manually, as documented in the Distributed Tracing Task.
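As a rough sketch of manual propagation, the helper below copies the tracing headers listed above from an inbound request onto the headers of an outbound request. The function name and the sample header values are hypothetical, and the header list simply combines the B3, Lightstep, and W3C headers named in this answer; forward only the ones relevant to your tracing backend.

```python
# Tracing headers named in this FAQ, lowercased for case-insensitive matching.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
    "x-ot-span-context",
    "traceparent",
    "tracestate",
)


def trace_headers_to_forward(incoming_headers):
    """Return the tracing headers present on the inbound request, keyed in lowercase."""
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


# Hypothetical inbound request headers.
incoming = {
    "X-Request-Id": "abc-123",
    "X-B3-TraceId": "463ac35c9f6413ad",
    "Accept": "text/html",
}
outgoing_headers = trace_headers_to_forward(incoming)
print(outgoing_headers)
```

The resulting dictionary would then be merged into the headers of every outbound call made while handling that inbound request.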
For Envoy-based tracing integrations, Envoy (the sidecar proxy) sends tracing information directly to tracing backends on behalf of the applications being proxied.
Envoy:
- generates request IDs and trace headers (e.g., X-B3-TraceId) for requests as they flow through the proxy
- generates trace spans for each request based on request and response metadata (e.g., response time)
- sends the generated trace spans to the tracing backends
- forwards the trace headers to the proxied application
Istio supports the Envoy-based integrations of Lightstep and Zipkin, as well as all Zipkin API-compatible backends, including Jaeger.
The Istio minimal profile with tracing enabled is all that is required for Istio to integrate with Zipkin-compatible backends.
The Istio sidecar proxy (Envoy) generates the initial headers, if they are not provided by the request.
Although an Istio sidecar will process both inbound and outbound requests for an associated application instance, it has no implicit way of correlating the outbound requests to the inbound request that caused them. The only way this correlation can be achieved is if the application propagates relevant information (i.e. headers) from the inbound request to the outbound requests. Header propagation may be accomplished through client libraries or manually. Further discussion is provided in What is required for distributed tracing with Istio?.
Since Istio 1.0.3, the sampling rate for tracing has been reduced to 1% in the default configuration profile. This means that only 1 out of 100 trace instances captured by Istio will be reported to the tracing backend. The sampling rate in the demo profile is still set to 100%. See this section for more information on how to set the sampling rate.
If you still do not see any trace data, please confirm that your ports conform to the Istio port naming conventions and that the appropriate container port is exposed (via pod spec, for example) to enable traffic capture by the sidecar proxy (Envoy).
If you only see trace data associated with the egress proxy, but not the ingress proxy, it may still be related to the Istio port naming conventions. Starting with Istio 1.3 the protocol for outbound traffic is automatically detected.
Istio, via Envoy, currently supports a percentage-based sampling strategy for trace generation. Please see this section for more information on how to set this sampling rate.
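To illustrate what a percentage-based strategy means in practice, the sketch below makes an independent random decision per request. It is only an illustration of the idea, not Envoy's actual implementation; the function name and the fixed seed are assumptions made for repeatability.

```python
import random


def should_sample(sampling_percent: float, rng: random.Random) -> bool:
    """Decide whether to report a trace, given a sampling rate in percent (0-100)."""
    return rng.random() * 100.0 < sampling_percent


rng = random.Random(42)  # seeded only so the sketch is repeatable
reported = sum(should_sample(1.0, rng) for _ in range(100_000))
print(f"reported {reported} of 100000 traces at a 1% sampling rate")
```

At a 1% rate roughly one request in a hundred is reported, which is why a lightly loaded test service may show no traces at all until the rate is raised.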
If you already have installed Istio with tracing enabled, you can disable it as follows:
# Replace <istio namespace> with the namespace of your Istio mesh, for example istio-system.
$ TRACING_POD=$(kubectl get po -n <istio namespace> | grep istio-tracing | awk '{print $1}')
$ kubectl delete pod $TRACING_POD -n <istio namespace>
$ kubectl delete services tracing zipkin -n <istio namespace>
# Now, manually remove instances of trace_zipkin_url from the file and save it.
Then follow the steps of the cleanup section of the Distributed Tracing task.
If you don’t want tracing functionality at all, then disable tracing when installing Istio.
To do so, you must use the fully qualified domain name of the Zipkin-compatible instance. For example: zipkin.mynamespace.svc.cluster.local.
Istio does not currently provide support for pub/sub and event bus protocols. Any use of those technologies is best-effort and subject to breakage.
Traffic Management
Rules can be viewed using kubectl get virtualservice -o yaml.
Istio captures inbound traffic on all ports by default. You can override this behavior using the traffic.sidecar.istio.io/includeInboundPorts pod annotation to specify an explicit list of ports to capture, or using traffic.sidecar.istio.io/excludeInboundPorts to specify a list of ports to bypass.
Both of these DestinationRule settings will send mutual TLS traffic. With ISTIO_MUTUAL, Istio certificates will automatically be used. For MUTUAL, the key, certificate, and trusted CA must be configured. This allows initiating mutual TLS with non-Istio applications.
Yes, Istio fully supports these workloads as of Istio 1.10.
Simple ingress specifications, with host, TLS, and exact path based matches, will work out of the box without the need for route rules. However, note that the path used in the ingress resource should not have any . characters.
For example, the following ingress resource matches requests for the example.com host, with /helloworld as the URL.
$ kubectl create -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: simple-ingress
  annotations:
    kubernetes.io/ingress.class: istio
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /helloworld
        pathType: Prefix
        backend:
          service:
            name: myservice
            port:
              number: 8000
EOF
However, the following rules will not work because they use regular expressions in the path and ingress.kubernetes.io annotations:
$ kubectl create -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: this-will-not-work
  annotations:
    kubernetes.io/ingress.class: istio
    # Ingress annotations other than ingress class will not be honored
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /hello(.*?)world/
        pathType: Prefix
        backend:
          service:
            name: myservice
            port:
              number: 8000
EOF
After applying CORS configuration, you may find that seemingly nothing happened and wonder what went wrong. CORS is a commonly misunderstood HTTP concept that often leads to confusion when configuring.
To understand this, it helps to take a step back and look at what CORS is and when it should be used.
By default, browsers have restrictions on "cross origin" requests initiated by scripts. This prevents, for example, a website attack.example.com from making a JavaScript request to bank.example.com and stealing a user's sensitive information.
In order to allow this request, bank.example.com must allow attack.example.com to perform cross origin requests. This is where CORS comes in. If we were serving bank.example.com in an Istio enabled cluster, we could configure a corsPolicy to allow this:
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: bank
spec:
  hosts:
  - bank.example.com
  http:
  - corsPolicy:
      allowOrigins:
      - exact: https://attack.example.com
...
In this case we explicitly allow a single origin; wildcards are common for non-sensitive pages.
Once we do this, a common mistake is to send a request like curl bank.example.com -H "Origin: https://attack.example.com", and expect the request to be rejected.
However, curl and many other clients will not see a rejected request, because CORS is a browser constraint.
The CORS configuration simply adds Access-Control-* headers in the response; it is up to the client (browser) to reject the request if the response is not satisfactory.
In browsers, this is done by a Preflight request.
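The division of labor can be sketched as follows: the server only attaches headers, and the browser decides. This is a deliberately simplified model of the browser-side check for a non-credentialed request; the function name is hypothetical, and the assumption that no Access-Control-* headers are added for a non-matching origin is labeled as such in the code.

```python
def browser_allows_read(request_origin: str, response_headers: dict) -> bool:
    """Simplified browser-side CORS check for a non-credentialed request."""
    allowed = response_headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == request_origin


# Assumed response headers when the Origin matches the corsPolicy.
matched = {"access-control-allow-origin": "https://attack.example.com"}
# Assumption: for a non-matching Origin, no Access-Control-* headers are added.
unmatched = {}

print(browser_allows_read("https://attack.example.com", matched))
print(browser_allows_read("https://other.example.com", unmatched))
```

In both cases the response itself is delivered; a client such as curl never performs this check, which is why it does not observe a rejection.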
Currently, Istio supports TCP based protocols. Additionally, Istio provides functionality such as routing and metrics for other protocols such as http and mysql.
For a list of all protocols, and information on how to configure protocols, view the Protocol Selection documentation.